Reversible Architectures for Arbitrarily Deep Residual Neural Networks
Authors
Abstract
Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing state-of-the-art performance with deeper and wider architectures. In this work, we interpret deep residual networks as ordinary differential equations (ODEs), which have long been studied in mathematics and physics with rich theoretical and empirical success. From this interpretation, we develop a theoretical framework on the stability and reversibility of deep neural networks and derive three reversible neural network architectures that can, in theory, go arbitrarily deep. The reversibility property allows a memory-efficient implementation that does not need to store the activations of most hidden layers. Together with the stability of our architectures, this enables training deeper networks using only modest computational resources. We provide both theoretical analyses and empirical results. Experiments on CIFAR-10, CIFAR-100 and STL-10 demonstrate the efficacy of our architectures against several strong baselines, with performance that matches or exceeds the state of the art. Furthermore, we show that our architectures yield superior results when trained with less training data.
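The ODE reading treats a residual update y_{t+1} = y_t + h * f(y_t) as one forward Euler step with step size h, and reversibility means a block's input can be recomputed exactly from its output, so activations need not be cached for backpropagation. Below is a minimal NumPy sketch of that idea using a two-state additive coupling in the spirit of the reversible architectures described above; the function names, the ReLU residual maps, the weight matrices W_f, W_g and the step size h are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def block_forward(x1, x2, W_f, W_g, h=0.1):
    # One reversible residual block; h plays the role of the ODE step size.
    # Each update is additive, so it can be undone exactly.
    y1 = x1 + h * relu(W_f @ x2)
    y2 = x2 + h * relu(W_g @ y1)
    return y1, y2

def block_inverse(y1, y2, W_f, W_g, h=0.1):
    # Recover the block inputs exactly from its outputs by
    # re-applying the residual maps and subtracting.
    x2 = y2 - h * relu(W_g @ y1)
    x1 = y1 - h * relu(W_f @ x2)
    return x1, x2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    W_f, W_g = rng.standard_normal((d, d)), rng.standard_normal((d, d))
    x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
    y1, y2 = block_forward(x1, x2, W_f, W_g)
    r1, r2 = block_inverse(y1, y2, W_f, W_g)
    # Reconstruction is exact up to floating-point error.
    print(np.allclose(x1, r1), np.allclose(x2, r2))  # True True

Because the inverse only re-applies the residual maps, a backward pass can rebuild each block's inputs on the fly instead of storing them, which is the source of the memory savings claimed in the abstract.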
Similar articles
Stable Architectures for Deep Neural Networks
Deep neural networks have become valuable tools for supervised machine learning, e.g., in the classification of text or images. While offering superior results over traditional techniques to find and express complicated patterns in data, deep architectures are known to be challenging to design and train such that they generalize well to new data. An important issue that must be overcome is nume...
Identity Matters in Deep Learning
An emerging design principle in deep learning is that each layer of a deep artificial neural network should be able to easily express the identity transformation. This idea not only motivated various normalization techniques, such as batch normalization, but was also key to the immense success of residual networks. In this work, we put the principle of identity parameterization on a more solid ...
Understanding Very Deep Networks via Volume Conservation
Recently, very deep neural networks have set new records across many application domains, such as Residual Networks on the ImageNet challenge and Highway Networks on language processing tasks. We expect further excellent performance improvements in different fields from these very deep networks. However, these networks are still poorly understood, especially since they rely on non-standard architectures...
Going Deeper in Spiking Neural Networks: VGG and Residual Architectures
Over the past few years, Spiking Neural Networks (SNNs) have become popular as a possible pathway to enable low-power event-driven neuromorphic hardware. However, their application in machine learning has largely been limited to very shallow neural network architectures for simple problems. In this paper, we propose a novel algorithmic technique for generating an SNN with a deep architecture, ...
Residual Networks: Lyapunov Stability and Convex Decomposition
While the training error of most deep neural networks degrades as the depth of the network increases, residual networks appear to be an exception. We show that the main reason for this is the Lyapunov stability of the gradient descent algorithm: for an arbitrarily chosen step size, the equilibria of the gradient descent are most likely to remain stable for the parametrization of residual networks. ...
Journal: CoRR
Volume: abs/1709.03698
Pages: -
Publication year: 2017